81 research outputs found

    Statistical Inference and Computational Methods for Large High-Dimensional Data with Network Structure.

    New technological advancements have allowed the collection of datasets of large volume and varying levels of complexity. Many of these datasets have an underlying network structure. Networks can capture dependence relationships among a group of entities, and hence analyzing these datasets unearths the underlying structural dependence among the individuals. Examples include gene regulatory networks, stock markets, protein-protein interactions within the cell, and online social networks. The thesis addresses two important aspects of large high-dimensional data with network structure. The first focuses on high-dimensional data with a network structure that evolves over time. Examples of such data sets include time course gene expression data and voting records of legislative bodies. The main task is to estimate the change-point as well as the network structures before and after it. The network structures are obtained by a penalized optimization method, and we establish a finite sample estimation error bound for the change-point in the high-dimensional regime. The other aspect that we examine is parameter estimation in large heterogeneous data with network structure. Our primary goal is to develop efficient computational techniques based on random subsampling and parallelization to estimate the parameters. We provide an analysis of the rate of decay of the bias and variance of our parallel implementation with a single round of communication after every iteration. We further show two applications of our methodology, to the Gaussian Mixture Model (GMM) and the Stochastic Block Model (SBM). The emphasis is placed on developing new theoretical techniques and computational tools for network problems and applying the corresponding methodology in many fields, including biomedical and social science research, where network modeling and analysis play an exceedingly important role.
    PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/113602/1/sandipan_1.pd

    Consistent Multiple Change-point Estimation with Fused Gaussian Graphical Models


    Likelihood Inference for Large Scale Stochastic Blockmodels with Covariates based on a Divide-and-Conquer Parallelizable Algorithm with Communication

    We consider a stochastic blockmodel equipped with node covariate information, which is helpful in analyzing social network data. The key objective is to obtain maximum likelihood estimates of the model parameters. For this task, we devise a fast, scalable Monte Carlo EM type algorithm based on a case-control approximation of the log-likelihood coupled with a subsampling approach. A key feature of the proposed algorithm is its parallelizability: portions of the data are processed on several cores, while key statistics are communicated across the cores during each iteration of the algorithm. The performance of the algorithm is evaluated on synthetic data sets and compared with competing methods for blockmodel parameter estimation. We also illustrate the model on data from a Facebook-derived social network enhanced with node covariate information.
    Comment: 28 pages, 4 figures
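To make the case-control idea in the abstract above concrete, here is a minimal sketch under simplifying assumptions: a plain two-block stochastic blockmodel without covariates, known block labels, and fixed connection probabilities. All edges (cases) are kept, while non-edges (controls) are subsampled and reweighted. The function names, the `theta` values, and the network size are hypothetical and are not taken from the paper, whose algorithm also involves Monte Carlo EM and parallel communication not shown here.

```python
import math
import random


def edge_prob(i, j, blocks, theta):
    """Connection probability for dyad (i, j) under a stochastic blockmodel."""
    return theta[blocks[i]][blocks[j]]


def full_loglik(edges, non_edges, blocks, theta):
    """Exact Bernoulli log-likelihood summed over every dyad."""
    ll = sum(math.log(edge_prob(i, j, blocks, theta)) for i, j in edges)
    ll += sum(math.log(1.0 - edge_prob(i, j, blocks, theta)) for i, j in non_edges)
    return ll


def case_control_loglik(edges, non_edges, blocks, theta, n_controls, rng):
    """Keep all edges (cases); subsample non-edges (controls) and reweight them."""
    m = min(n_controls, len(non_edges))
    controls = rng.sample(non_edges, m)
    weight = len(non_edges) / m if m else 0.0
    ll = sum(math.log(edge_prob(i, j, blocks, theta)) for i, j in edges)
    ll += weight * sum(math.log(1.0 - edge_prob(i, j, blocks, theta)) for i, j in controls)
    return ll


# Simulate a small two-block network (all values here are illustrative assumptions).
rng = random.Random(1)
n = 40
blocks = [0] * 20 + [1] * 20
theta = [[0.30, 0.05], [0.05, 0.25]]
edges, non_edges = [], []
for i in range(n):
    for j in range(i + 1, n):
        target = edges if rng.random() < edge_prob(i, j, blocks, theta) else non_edges
        target.append((i, j))

exact = full_loglik(edges, non_edges, blocks, theta)
approx = case_control_loglik(edges, non_edges, blocks, theta, 200, rng)
print(round(exact, 2), round(approx, 2))
```

For brevity the sketch enumerates every non-edge before sampling, which defeats the computational purpose on a large network; a scalable version would draw controls without materializing the full list.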

    Bayesian Inference in Nonparametric Dynamic State-Space Models

    We introduce state-space models where the functionals of the observational and the evolutionary equations are unknown, and treated as random functions evolving with time. Thus, our model is nonparametric and generalizes the traditional parametric state-space models. This random function approach also frees us from the restrictive assumption that the functional forms, although time-dependent, are of fixed forms. The traditional approach of assuming known, parametric functional forms is questionable, particularly in state-space models, since validating the assumptions requires data on both the observed time series and the latent states; however, data on the latter are not available in state-space models. We specify Gaussian processes as priors of the random functions and exploit the "look-up table approach" of \ctn{Bhattacharya07} to efficiently handle the dynamic structure of the model. We consider both univariate and multivariate situations, using the Markov chain Monte Carlo (MCMC) approach for studying the posterior distributions of interest. In the case of challenging multivariate situations we demonstrate that the newly developed Transformation-based MCMC (TMCMC) of \ctn{Dutta11} provides interesting and efficient alternatives to the usual proposal distributions. We illustrate our methods with a challenging multivariate simulated data set, where the true observational and evolutionary equations are highly non-linear, and treated as unknown. The results we obtain are quite encouraging. Moreover, using our Gaussian process approach we analysed a real data set, which has also been analysed by \ctn{Shumway82} and \ctn{Carlin92} using the linearity assumption. Our analyses show that towards the end of the time series, the linearity assumption of the previous authors breaks down.
    Comment: This version contains much greater clarification of the look-up table idea; a theorem regarding it is also proven and included in the supplement. Will appear in Statistical Methodology

    ERDOSTEINE: AN EFFECTIVE ANTIOXIDANT FOR PROTECTING COMPLETE FREUND’S ADJUVANT INDUCED ARTHRITIS IN RATS

    Objective: The objective of this study was to evaluate the protective effect of Erdosteine on complete Freund's adjuvant (CFA) induced arthritic rats. Methods: Wistar Albino rats of 100–250 g were divided into five groups (n=6) and administered 0.1 ml of CFA subcutaneously into the left hind paw, except for the negative control group. The standard group received methotrexate (MTX) 0.075 mg/kg body weight orally. The test groups received Erdosteine orally at doses of 10 mg/kg and 20 mg/kg body weight for 12 days. The changes in body weight, paw volume, hematological parameters, and radiographical and histological findings were the indicators used to evaluate the efficacy of the test product. Discussion: Significant changes in body weight, paw volume, and radiographical, hematological, and histological parameters were observed, which support a remarkable reduction of arthritic development in the standard and test groups compared to the untreated group. However, the test group receiving Erdosteine at 20 mg/kg appeared more potent in reducing the arthritic effect than both the test group receiving Erdosteine at 10 mg/kg and the standard group (MTX). Results: The test group with 20 mg/kg Erdosteine showed a much better outcome than the standard group, at a significant level (p<0.05). Therefore, Erdosteine, acting as an anti-inflammatory and antioxidant, is effective at a dose of 20 mg/kg in treating the progression of rheumatoid arthritis in rats.

    Change-Point Estimation in High-Dimensional Markov Random Field Models

    The paper investigates a change point estimation problem in the context of high-dimensional Markov random field models. Change points represent a key feature in many dynamically evolving network structures. The change point estimate is obtained by maximizing a profile penalized pseudolikelihood function under a sparsity assumption. We also derive a tight bound for the estimate, up to a logarithmic factor, even in settings where the number of possible edges in the network far exceeds the sample size. The performance of the proposed estimator is evaluated on synthetic data sets, and the estimator is also used to explore voting patterns in the US Senate in the 1979-2012 period.
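The profile-likelihood scan described in the abstract above can be illustrated with a deliberately simplified sketch. It assumes a univariate Gaussian series with a single change point and no penalty, which is much simpler than the penalized pseudolikelihood for Markov random fields that the paper studies. The function names, the `min_seg` parameter, and the simulated series are hypothetical.

```python
import math
import random


def gaussian_profile_loglik(xs):
    """Log-likelihood of a segment under N(mean, var), with both set to their MLEs."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    var = max(var, 1e-12)  # guard against a zero-variance segment
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)


def estimate_change_point(xs, min_seg=5):
    """Return the split index tau maximizing the profile log-likelihood of xs[:tau] and xs[tau:]."""
    best_tau, best_ll = None, -math.inf
    for tau in range(min_seg, len(xs) - min_seg + 1):
        ll = gaussian_profile_loglik(xs[:tau]) + gaussian_profile_loglik(xs[tau:])
        if ll > best_ll:
            best_tau, best_ll = tau, ll
    return best_tau


# Simulated series whose mean shifts after observation 60 (illustrative values only).
rng = random.Random(0)
series = [rng.gauss(0.0, 1.0) for _ in range(60)] + [rng.gauss(3.0, 1.0) for _ in range(40)]
tau_hat = estimate_change_point(series)
print(tau_hat)
```

For each candidate split, the segment parameters are profiled out by plugging in their maximum likelihood estimates, so only the split index remains to be maximized over. In the high-dimensional network setting each segment fit would be replaced by a sparse penalized estimate.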

    IN-VITRO STUDY ON THE HEMOLYTIC ACTIVITY OF DIFFERENT EXTRACTS OF INDIAN MEDICINAL PLANT CROTON BONPLANDIANUM WITH PHYTOCHEMICAL ESTIMATION: A NEW ERA IN DRUG DEVELOPMENT

    In this study, different extracts of the leaves of Croton bonplandianum were screened for haemolytic activity towards human erythrocytes. The haemolytic activity was measured by a modified spectroscopic method at four different concentrations (300, 150, 75, 25 μg/ml). The haemolytic activity of the different extracts of Croton bonplandianum was found in the following order: ethyl acetate extract > chloroform extract > benzene extract. However, all the extracts, alone and in combination with each other, exhibited very low haemolytic activity. E. ganitrus did not exhibit any haemolytic activity at any dilution. Hence, they can be considered safe to human erythrocytes.
    Keywords: Hemolytic activity, Croton bonplandianum, Erythrocyte

    Interpretable brain age prediction using linear latent variable models of functional connectivity

    Neuroimaging-driven prediction of brain age, defined as the predicted biological age of a subject using only brain imaging data, is an exciting avenue of research. In this work we seek to build models of brain age based on functional connectivity while prioritizing model interpretability and understanding. This way, the models serve both to provide accurate estimates of brain age and to allow us to investigate changes in functional connectivity which occur during the ageing process. The methods proposed in this work consist of a two-step procedure: first, linear latent variable models, such as PCA and its extensions, are employed to learn reproducible functional connectivity networks present across a cohort of subjects. The activity within each network is subsequently employed as a feature in a linear regression model to predict brain age. The proposed framework is applied to data from the CamCAN repository, and the inferred brain age models are further demonstrated to generalize using data from two open-access repositories: the Human Connectome Project and the ATR Wide-Age-Range.
    Peer reviewed
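The two-step procedure in the abstract above (a linear latent variable model followed by linear regression) can be sketched on toy data as follows. This is an assumption-laden illustration, not the authors' pipeline: it extracts a single principal component by power iteration, uses synthetic features in place of functional connectivity estimates, and reports only the in-sample fit. The names `loadings`, `features`, and the helper functions are hypothetical.

```python
import random


def center_columns(rows):
    """Subtract each column's mean so PCA operates on centered features."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def first_principal_component(rows, iters=200):
    """Leading eigenvector of the sample covariance, found by power iteration."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[a] * r[b] for r in rows) / (n - 1) for b in range(d)] for a in range(d)]
    v = [1.0 / d ** 0.5] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


# Synthetic stand-in for per-subject connectivity features (purely illustrative).
rng = random.Random(0)
loadings = [0.8, 0.5, -0.3, 0.1]
ages = [rng.uniform(20, 80) for _ in range(80)]
features = [[w * (age - 50) / 10 + rng.gauss(0, 0.3) for w in loadings] for age in ages]

# Step 1: learn one latent component and compute each subject's activity on it.
centered = center_columns(features)
component = first_principal_component(centered)
scores = [sum(x * w for x, w in zip(row, component)) for row in centered]

# Step 2: regress age on the component activity.
intercept, slope = fit_line(scores, ages)
predicted = [intercept + slope * s for s in scores]
mean_age = sum(ages) / len(ages)
ss_res = sum((a - p) ** 2 for a, p in zip(ages, predicted))
ss_tot = sum((a - mean_age) ** 2 for a in ages)
r2 = 1 - ss_res / ss_tot
print(round(r2, 3))
```

Because both steps are linear, the regression slope and the component weights can be read together to see which input features drive the predicted age, which is the interpretability argument the abstract makes. A realistic evaluation would fit on one cohort and test on held-out subjects.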